Abstract

A study of the substructure of jets with transverse momentum greater than 400 GeV/$c$ produced
in proton-antiproton collisions at a center-of-mass energy of 1.96 TeV at the Fermilab Tevatron
Collider and recorded by the CDF II detector is presented. The distributions of the jet mass,
angularity, and planar flow are measured for the first time in a sample with an integrated luminosity
of 5.95 fb$^{-1}$. The observed substructure for high mass jets is consistent with predictions from
perturbative quantum chromodynamics.

The study of high transverse momentum ($p_T$) massive jets produced in proton-antiproton
($p\bar{p}$) interactions provides an important test of perturbative QCD (pQCD) and gives insight
into the parton showering mechanism (see, e.g., [1, 2] for recent reviews). Furthermore,
massive boosted jets constitute an important background in searches for various new physics
models [3-6], the Higgs boson [7], and highly boosted top quark production. Particularly
relevant is the case where the decay of a heavy resonance produces high-$p_T$ top quarks that
decay hadronically. In all these cases, the hadronic decay products can be detected as a single
jet with a large mass and internal substructure that differs on average from pQCD jets once
the jet $p_T$ is greater than 400-500 GeV/$c$. However, experimental studies of the substructure
of high-$p_T$ jets at the Tevatron have been limited to jets with $p_T < 400$ GeV/$c$ [8, 9]; recently,
results with higher-$p_T$ jets produced at the Large Hadron Collider have been published [10].

Jets produced through QCD processes with large mass are expected to arise predominantly
through a process of single hard gluon emission from a high-$p_T$ quark or gluon [11].
The probability of this process is given by the jet function, $J(m^{\mathrm{jet}}, p_T, R)$, for which a simple
next-to-leading-order (NLO) approximation is

$$ J(m^{\mathrm{jet}}, p_T, R) \simeq \alpha_s(p_T)\,\frac{4\,C_{q,g}}{\pi\,m^{\mathrm{jet}}}\,\log\!\left(\frac{R\,p_T}{m^{\mathrm{jet}}}\right), \tag{1} $$

where $m^{\mathrm{jet}}$ is the jet mass, $\alpha_s(p_T)$ is the strong coupling, $C_{q,g} = 4/3$ and 3 for quark and gluon
jets, respectively, and $R$ is the cone radius used to define the jet [11]. The approximation
holds for $m^{\mathrm{jet}} \ll R\,p_T$. Although uncertainties from higher-order corrections are ~30%, it
predicts both the shape of the spectrum and the fraction of jets with masses greater than
about 100 GeV/$c^2$. Two other jet substructure variables insensitive to soft radiation at high
jet mass are angularity and planar flow [12-16]. The angularity is defined as

$$ \tau_{-2}(R, p_T) = \frac{1}{m^{\mathrm{jet}}} \sum_{i \in \mathrm{jet}} E_i \sin^{-2}\theta_i\,[1 - \cos\theta_i]^{3} \approx \frac{2^{-3}}{m^{\mathrm{jet}}} \sum_{i \in \mathrm{jet}} E_i\,\theta_i^{4}, \tag{2} $$

where the sum is over the constituents in the jet cluster, $E_i$ is the energy and $\theta_i$ is the angle
of each constituent relative to the jet axis. It is sensitive to radiation near the edge of the
cone and has a characteristic shape for QCD jets. Planar flow is defined as

$$ Pf = \frac{4\,\lambda_1 \lambda_2}{(\lambda_1 + \lambda_2)^2}, \tag{3} $$

where $\lambda_{1,2}$ are the eigenvalues of the two-dimensional moment matrix

$$ I^{kl} = \frac{1}{m^{\mathrm{jet}}} \sum_{i \in \mathrm{jet}} E_i\,\frac{p_{i,k}}{E_i}\,\frac{p_{i,l}}{E_i}, \tag{4} $$

in which $p_{i,k}$ is the $k$ component of the jet constituent's transverse momentum relative to the jet
axis, i.e., in one of the two directions that span the plane perpendicular to the jet direction.
Jets with three or more energetic constituents, such as those arising from a boosted top
quark, are more planar with $Pf \sim 1$, compared with massive QCD jets where the energy
flow is along the line defined by the two final-state partons and $Pf \sim 0$. Both of these
variables are perturbatively calculable.
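
As an illustration of the two definitions above, the following minimal sketch computes the angularity $\tau_{-2}$ and the planar flow for toy jets. It assumes massless constituents given as `(E, theta, phi)` relative to an axis that coincides with the jet axis; the function names and the two toy configurations are hypothetical and are not taken from the analysis.

```python
import math

def jet_mass(constituents):
    """Invariant mass of the summed massless constituents (E, theta, phi)."""
    e = sum(en for en, _, _ in constituents)
    px = sum(en * math.sin(th) * math.cos(ph) for en, th, ph in constituents)
    py = sum(en * math.sin(th) * math.sin(ph) for en, th, ph in constituents)
    pz = sum(en * math.cos(th) for en, th, _ in constituents)
    return math.sqrt(max(e * e - px * px - py * py - pz * pz, 0.0))

def angularity(constituents):
    """tau_{-2} = (1/m) * sum_i E_i * sin^-2(theta_i) * (1 - cos(theta_i))^3."""
    m = jet_mass(constituents)
    # Assumes every constituent has theta > 0 (no constituent exactly on the axis).
    total = sum(en * (1.0 - math.cos(th)) ** 3 / math.sin(th) ** 2
                for en, th, _ in constituents)
    return total / m

def planar_flow(constituents):
    """Pf = 4 det(I) / tr(I)^2 for the 2x2 moment matrix I^{kl}."""
    ixx = iyy = ixy = 0.0
    for en, th, ph in constituents:
        kx = en * math.sin(th) * math.cos(ph)  # transverse components
        ky = en * math.sin(th) * math.sin(ph)  # relative to the axis
        ixx += kx * kx / en
        iyy += ky * ky / en
        ixy += kx * ky / en
    # The overall 1/m factor of I^{kl} cancels in the ratio, so it is omitted.
    det = ixx * iyy - ixy * ixy
    trace = ixx + iyy
    return 4.0 * det / (trace * trace)

# Toy configurations (hypothetical): a symmetric two-body jet and a
# symmetric three-body jet, both centred on the axis.
two_body = [(200.0, 0.15, 0.0), (200.0, 0.15, math.pi)]
three_body = [(130.0, 0.2, 2.0 * math.pi * k / 3.0) for k in range(3)]

print("two-body   Pf =", planar_flow(two_body), " tau_-2 =", angularity(two_body))
print("three-body Pf =", planar_flow(three_body))
```

The planar flow uses the identity $4\lambda_1\lambda_2/(\lambda_1+\lambda_2)^2 = 4\det I/(\mathrm{tr}\,I)^2$, which avoids an explicit eigenvalue decomposition.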

We report in this Letter the first measurement of the jet mass distribution for jets with
$p_T > 400$ GeV/$c$ produced in 1.96 TeV $p\bar{p}$ collisions at the Fermilab Tevatron Collider
and recorded by the CDF II detector. For jets with masses greater than 90 GeV/$c^2$, we also
measure the angularity and planar flow distributions. We use the Midpoint cone algorithm [17]
to reconstruct jets using the FASTJET program [18] and the anti-$k_t$ algorithm [19],
allowing for a direct comparison of cone and recombination algorithms.

The CDF II detector [20] consists of a solenoidal charged particle spectrometer surrounded
by a calorimeter and muon system. Charged particle momenta are measured over $|\eta| < 1.1$.
The calorimeter covers the region $|\eta| < 3.6$, with the region $|\eta| < 1.1$ segmented into towers
of size $\Delta\eta \times \Delta\phi = 0.11 \times 0.26$ [21]. The calorimeter system is used to measure jets and
missing transverse energy ($\vec{\not\!\!E}_T$), defined as

$$ \vec{\not\!\!E}_T = -\sum_i E_T^i\,\hat{n}_i, \tag{5} $$

where the sum is over the calorimeter towers with $|\eta| < 3.6$ and $\hat{n}_i$ is a unit vector perpendicular
to the beam axis and pointing at the $i$th calorimeter tower. We also define $\not\!\!E_T = |\vec{\not\!\!E}_T|$.
The 4-momentum of a jet is the sum over the calorimeter towers in the jet, where each
calorimeter tower is treated as a massless 4-vector, and the jet mass is obtained from the
resulting 4-vector.
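
The tower sums described above can be sketched as follows. This is only an illustration under stated assumptions: towers are represented as hypothetical `(E_T, eta, phi)` tuples, each treated as a massless 4-vector, and the function names are not from the CDF software.

```python
import math

def tower_four_vector(et, eta, phi):
    """Massless 4-vector (E, px, py, pz) of a tower with transverse energy et."""
    return (et * math.cosh(eta), et * math.cos(phi),
            et * math.sin(phi), et * math.sinh(eta))

def jet_mass(towers):
    """Mass of the 4-vector obtained by summing the towers in the jet."""
    e = px = py = pz = 0.0
    for et, eta, phi in towers:
        te, tx, ty, tz = tower_four_vector(et, eta, phi)
        e, px, py, pz = e + te, px + tx, py + ty, pz + tz
    return math.sqrt(max(e * e - px * px - py * py - pz * pz, 0.0))

def missing_et(towers, eta_max=3.6):
    """Missing transverse energy vector and magnitude, following Eq. (5)."""
    used = [(et, phi) for et, eta, phi in towers if abs(eta) < eta_max]
    mex = -sum(et * math.cos(phi) for et, phi in used)
    mey = -sum(et * math.sin(phi) for et, phi in used)
    return mex, mey, math.hypot(mex, mey)

# Hypothetical towers: two back-to-back deposits of 50 GeV each.
towers = [(50.0, 0.0, 0.0), (50.0, 0.0, math.pi)]
print("mass =", jet_mass(towers), " MET =", missing_et(towers)[2])
```

A single tower has zero mass by construction; a non-zero jet mass arises only from the angular spread of the towers in the sum.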

We select events in a sample with 5.95 fb$^{-1}$ integrated luminosity identified with an
inclusive jet trigger requiring at least one jet with transverse energy $E_T > 100$ GeV,
with the trigger becoming fully efficient for jets with $E_T > 140$ GeV. Jet candidates are
constructed with the Midpoint cone algorithm with cone radii of $R = 0.4$ and 0.7 and with
the anti-$k_t$ algorithm with a distance parameter $R = 0.7$. Primary collision vertices are
reconstructed using charged particle information. Events are required to have at least one
high-quality primary vertex with $|z_{\mathrm{vtx}}| < 60$ cm. Events are also required to be well
measured by satisfying a missing transverse energy significance requirement
of $S_{\mathrm{MET}} < 10$ GeV$^{1/2}$, defined as

$$ S_{\mathrm{MET}} = \frac{\not\!\!E_T}{\sqrt{\sum E_T}}, \tag{6} $$

where the sum is over all calorimeter towers. We calculate for each jet the scalar sum of the
$p_T$ of the tracks associated with the jet cluster. Each jet is required either to have more than
5% of its energy registered in the electromagnetic calorimeter or to have a summed track
momentum fraction of at least 5%. This criterion eliminates jet candidates arising from instrumental
backgrounds. Furthermore, we restrict the jet candidates to have $0.1 < |\eta_d| < 0.7$, where $\eta_d$
is the jet pseudorapidity in the detector frame of reference, to ensure optimal calorimeter
and charged particle tracking coverage. We further require that the leading jet in the event
have $p_T > 400$ GeV/$c$. We observe 2699 events.
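
The selection above can be summarized as a single predicate. The sketch below is a hedged illustration only: the event layout, the field names (`z_vtx_cm`, `met`, `sum_et`, `em_fraction`, `track_fraction`, `eta_det`, `pt`) and the example values are hypothetical, and the trigger requirement is not modeled.

```python
import math

def met_significance(met, sum_et):
    """S_MET = MET / sqrt(sum E_T), in GeV^(1/2), as in Eq. (6)."""
    return met / math.sqrt(sum_et)

def passes_selection(event):
    """Apply the vertex, MET significance and leading-jet requirements."""
    jet = event["leading_jet"]
    good_vertex = abs(event["z_vtx_cm"]) < 60.0
    well_measured = met_significance(event["met"], event["sum_et"]) < 10.0
    # Either requirement suffices to reject instrumental backgrounds.
    not_instrumental = jet["em_fraction"] > 0.05 or jet["track_fraction"] >= 0.05
    central = 0.1 < abs(jet["eta_det"]) < 0.7
    hard = jet["pt"] > 400.0  # GeV/c
    return good_vertex and well_measured and not_instrumental and central and hard

# Hypothetical event record, for illustration only.
event = {
    "z_vtx_cm": 12.0,
    "met": 40.0,
    "sum_et": 1600.0,
    "leading_jet": {"pt": 450.0, "eta_det": 0.4,
                    "em_fraction": 0.35, "track_fraction": 0.5},
}
print(passes_selection(event))
```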

The jet 4-momentum is corrected to take into account calorimeter energy response, which
is known to a precision of 3% [22] for central calorimeter jets with $p_T > 400$ GeV/$c$. We have
determined the uncertainty on the calibration of the jet mass measurement by comparing the
momentum flux of charged particles into three concentric regions of the calorimeter around
the jet centroid with the corresponding calorimeter response.

The number of interaction vertices ($N_{\mathrm{vtx}}$) is a measure of the number of multiple interactions
(MI), i.e., additional collisions in the same bunch crossing, and averages ~3
in this sample. We make a data-driven correction for MI effects on the jet substructure
variables [23]. We select a subset of events with a back-to-back dijet topology. We then
define cones at right angles to the leading jet in azimuth, of the same size as the jet cluster,
and add the calorimeter towers in these cones to the jet 4-vector after rotation by 90° into
the jet cone. The resulting average upward mass shift as a function of $m^{\mathrm{jet}}$ is taken as
the downward correction due to MI and the energy flow from the underlying event (UE)
of the hard collision. We separately measure the UE correction by using only events with
$N_{\mathrm{vtx}} = 1$. We correct the leading jet mass, $m^{\mathrm{jet1}}$, for events with $N_{\mathrm{vtx}} > 1$ by the difference
between the mass shift in multi-vertex events and the mass shift in single-vertex events.
The correction has an approximate $1/m^{\mathrm{jet1}}$ behaviour and averages ~4 GeV/$c^2$ for a jet
cone size of $R = 0.7$. The jet mass correction for a cone size of $R = 0.4$ is ~0.5 GeV/$c^2$,
consistent with the expected $R^4$ scaling [2]. In the following, we focus on results for $R = 0.7$
Midpoint jets.
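
The perpendicular-cone method and the $R^4$ scaling check can be sketched as below. This is a simplified illustration, not the analysis code: towers are hypothetical `(E_T, eta, phi)` tuples, the perpendicular cone is assumed to be centred at the jet azimuth plus 90°, and the function names are invented for the example.

```python
import math

def mass(towers):
    """Mass of the summed massless tower 4-vectors (E_T, eta, phi)."""
    e = px = py = pz = 0.0
    for et, eta, phi in towers:
        e += et * math.cosh(eta)
        px += et * math.cos(phi)
        py += et * math.sin(phi)
        pz += et * math.sinh(eta)
    return math.sqrt(max(e * e - px * px - py * py - pz * pz, 0.0))

def mi_mass_shift(jet_towers, perp_towers, rotation=-math.pi / 2.0):
    """Upward mass shift when perpendicular-cone towers are rotated into the jet."""
    rotated = [(et, eta, phi + rotation) for et, eta, phi in perp_towers]
    return mass(jet_towers + rotated) - mass(jet_towers)

def scale_to_cone(correction, r_from, r_to):
    """Rescale a mass correction assuming the R^4 behaviour quoted in the text."""
    return correction * (r_to / r_from) ** 4

# Hypothetical jet at phi = 0 and one soft tower in the cone at phi = pi/2.
jet = [(200.0, 0.3, 0.1), (200.0, 0.3, -0.1)]
perp = [(1.0, 0.3, math.pi / 2.0 + 0.3)]
print("mass shift:", mi_mass_shift(jet, perp))
print("R=0.4 correction from 4 GeV/c^2 at R=0.7:", scale_to_cone(4.0, 0.7, 0.4))
```

With the quoted ~4 GeV/$c^2$ at $R = 0.7$, the $R^4$ rescaling gives a value between 0.4 and 0.5 GeV/$c^2$ at $R = 0.4$, in line with the ~0.5 GeV/$c^2$ stated above.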

To model the high-$p_T$ processes, we used a PYTHIA 6.216 calculation [16] of QCD jet
production generated with parton $\hat{p}_T > 300$ GeV/$c$, using the Tune A [24] parameters for the
underlying event and the CTEQ5L parton distribution functions (PDFs), followed by a full
detector simulation. Based on a PYTHIA calculation, we estimate $W$ and $Z$ boson production
to contribute ~25 jets with masses between 60 and 100 GeV/$c^2$, which is less than 5% of
the number observed. However, top quark pair production can contribute to the jet mass
region $m^{\mathrm{jet1}} > 100$ GeV/$c^2$, where the expected QCD jet rate is much lower. We employ an
approximate next-to-next-to-leading-order (NNLO) calculation of the $t\bar{t}$ differential cross
section [25] updated with the MSTW 2008 PDFs [26] and a top quark mass of $m_{\mathrm{top}} =
173$ GeV/$c^2$ [27]. This yields a cross section for top quark jets with $p_T > 400$ GeV/$c$ of
4.6 fb. We used the PYTHIA 6.216 generator to create a $t\bar{t}$ MC sample and applied the same
selection requirements used to define the event sample. The estimated $t\bar{t}$ contribution to
the data sample, normalized to the NNLO cross section, is 13 ± 4 events.

Two-thirds of the $t\bar{t}$ events with a leading high-$p_T$ jet would produce a recoil jet with
a large jet mass ($m^{\mathrm{jet2}}$) arising from the fully hadronic decay of the recoil top quark. The
remaining $t\bar{t}$ events would have a recoil top quark that decays semileptonically, resulting in
large $\not\!\!E_T$ and a recoil jet with lower $p_T$ and $m^{\mathrm{jet2}}$. We reduce these backgrounds by rejecting
events with $m^{\mathrm{jet2}} > 100$ GeV/$c^2$ or with $S_{\mathrm{MET}} > 4$ GeV$^{1/2}$, the latter being a more
stringent $\not\!\!E_T$ requirement than the one used in the event selection. Approximately 25% (80%) of the $t\bar{t}$ (QCD) MC events
survive these requirements. We observe 30 jets with $m^{\mathrm{jet}} > 140$ GeV/$c^2$ and expect a $t\bar{t}$
contribution of at most three jets.
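
The top-rejection requirement and the size of the surviving $t\bar{t}$ contribution can be illustrated with a short sketch. The function names are hypothetical, and the yield estimate simply scales the quoted central value of 13 events by the quoted 25% survival fraction, ignoring its uncertainty and any jet mass requirement.

```python
def passes_top_rejection(m_jet2, s_met):
    """Keep the event only if neither rejection condition is met.

    m_jet2 is the recoil jet mass in GeV/c^2, s_met is S_MET in GeV^(1/2).
    """
    return m_jet2 <= 100.0 and s_met <= 4.0

def surviving_yield(n_expected, survival_fraction):
    """Expected number of events left after the rejection requirements."""
    return n_expected * survival_fraction

print(passes_top_rejection(m_jet2=80.0, s_met=3.0))    # light recoil jet, small S_MET
print(passes_top_rejection(m_jet2=150.0, s_met=3.0))   # massive recoil jet
print(surviving_yield(13.0, 0.25))                     # central ttbar estimate
```

The scaled central value is close to three events, which is the order of magnitude of the $t\bar{t}$ contribution quoted above.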

In order to compare our results with QCD predictions, we correct the $m^{\mathrm{jet}}$ distributions for
effects of selection and resolution by an unfolding procedure, in which we correct bin-by-bin the
observed $m^{\mathrm{jet1}}$ distribution by the ratio of the QCD PYTHIA MC $m^{\mathrm{jet1}}$ distribution without
detector effects to the same distribution after measurement and selection effects have been
included. This jet mass unfolding correction was derived for each jet algorithm separately,
and the correction factors vary from 1.6 to 2.0 over the jet mass range above 70 GeV/$c^2$. These
corrections were verified through studies of the data and confirmed with MC calculations.
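
A bin-by-bin unfolding of the kind described above can be sketched in a few lines. The bin contents below are invented placeholders chosen only so that the factors fall in the quoted 1.6 to 2.0 range; they are not the MC or data values of the analysis, and the function names are hypothetical.

```python
def correction_factors(truth_counts, reco_counts):
    """Per-bin ratio of the MC distribution without detector effects
    to the MC distribution after measurement and selection effects."""
    factors = []
    for truth, reco in zip(truth_counts, reco_counts):
        # Bins with no reconstructed MC events cannot be corrected; use 0.
        factors.append(truth / reco if reco > 0 else 0.0)
    return factors

def unfold(observed_counts, factors):
    """Multiply each observed bin by its correction factor."""
    return [n * f for n, f in zip(observed_counts, factors)]

# Placeholder MC and data histograms in three jet mass bins.
mc_truth = [800.0, 400.0, 100.0]
mc_reco = [500.0, 220.0, 50.0]
data = [50.0, 20.0, 6.0]

factors = correction_factors(mc_truth, mc_reco)
print("factors: ", factors)
print("unfolded:", unfold(data, factors))
```

This approach treats each bin independently, so it relies on the MC describing bin-to-bin migrations well; that is a property of the method in general rather than a statement about this analysis.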

We summarize briefly our estimates of the systematic uncertainties that affect the substructure
observables. The overall jet mass scale at these energies is known to 2 (10) GeV/$c^2$
for jet masses of 60 (120) GeV/$c^2$, based on the jet energy scale uncertainty and the comparison
of the calorimeter energy and track momentum measurements within the jet mentioned
above. We assign an uncertainty on the MI correction of 2 GeV/$c^2$, which is half of the
average correction. We assign a ~15% uncertainty on the jet mass unfolding correction
due to modeling of the jet hadronization, the uncertainty arising from the selection, and
MC statistical uncertainties. The hadronization uncertainty is conservatively determined from
the change in the correction when hadronization is turned off in the MC samples.
We estimate the PDF uncertainties on the PYTHIA predictions by reweighting the MC
events using the $\pm 1\sigma$ variations in the 20 eigenvectors describing the uncertainties in the
PDFs [28]; the uncertainties on the jet mass, angularity and planar flow distributions are
10% or less in all cases.

We show in Fig. 1 a comparison of the unfolded $m^{\mathrm{jet1}}$ distribution for a cone size $R = 0.7$
with the analytic predictions for the jet function. This comparison, made for jet masses above
70 GeV/$c^2$, shows that the analytical prediction for quark jets approximately describes the
shape of the distribution and the fraction of jets but tends to overestimate the rate for jet
masses from 130 to 200 GeV/$c^2$. The better agreement of the quark jet function with data
compared with that of the gluon is consistent with the pQCD prediction that ~80% of
these jets arise from quarks [22]. Furthermore, the data and the PYTHIA distributions are
in reasonable agreement. We also compare in the inset figure the distributions obtained for
the Midpoint and anti-$k_t$ algorithms. The anti-$k_t$ jets have a very similar mass distribution
to the Midpoint jets. We find that 1.4 ± 0.3% of the Midpoint jets with $p_T > 400$ GeV/$c$
have $m^{\mathrm{jet1}} > 140$ GeV/$c^2$. This is the first measurement of this rate; it allows us to
constrain QCD predictions of this fraction and provides the first measurement of the rate
of backgrounds in a massive jet sample from QCD production of high-$p_T$ light quarks and
gluons.
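
For orientation, the NLO jet function approximation of Eq. (1), $J \simeq \alpha_s\,4C_{q,g}/(\pi m^{\mathrm{jet}})\,\log(R\,p_T/m^{\mathrm{jet}})$, can be evaluated numerically for quark and gluon jets. The sketch below is illustrative only: the value `ALPHA_S = 0.1` is an assumed placeholder for $\alpha_s(p_T)$ and is not taken from the analysis, and the function name is hypothetical.

```python
import math

C_QUARK = 4.0 / 3.0   # colour factor for quark jets
C_GLUON = 3.0         # colour factor for gluon jets
ALPHA_S = 0.1         # assumed placeholder for alpha_s(p_T); not from the source

def jet_function(m_jet, pt, r, colour_factor, alpha_s=ALPHA_S):
    """J ~ alpha_s * 4 C / (pi m) * log(R pT / m), valid for m << R pT."""
    return alpha_s * 4.0 * colour_factor / (math.pi * m_jet) * math.log(r * pt / m_jet)

pt, r = 400.0, 0.7
for m_jet in (70.0, 100.0, 140.0, 200.0):
    jq = jet_function(m_jet, pt, r, C_QUARK)
    jg = jet_function(m_jet, pt, r, C_GLUON)
    print(f"m = {m_jet:5.0f} GeV/c^2  J_quark = {jq:.3e}  J_gluon = {jg:.3e}")
```

In this approximation the gluon and quark predictions differ only by the ratio of colour factors, 9/4, at every mass, so the two curves have the same shape and differ in normalization.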

FIG. 1. The normalized jet mass distribution for Midpoint jets with $p_T > 400$ GeV/$c$ and
$|\eta| \in (0.1, 0.7)$. The uncertainties shown are statistical (black lines) and systematic (yellow bars).
The theory predictions for the jet function for quarks and gluons are shown as solid curves and
have an estimated uncertainty of ~30%. We also show the PYTHIA MC prediction (red dashed
line). The inset compares Midpoint (full black circles) and anti-$k_t$ (open green squares) jets.

FIG. 2. The angularity distribution for Midpoint jets with $p_T > 400$ GeV/$c$ and $|\eta| \in (0.1, 0.7)$.
We have applied cuts to reject $t\bar{t}$ events and required that $m^{\mathrm{jet1}} \in (90, 120)$ GeV/$c^2$. We also show
the PYTHIA calculation (red dashed line) and the pQCD kinematic endpoints. The inset compares
the distributions for Midpoint (full black circles) and anti-$k_t$ (open green squares) jets.

A key prediction of the NLO QCD calculation is that the distribution of angularities [12, 13]
of high mass jets has relatively sharp kinematical edges, with minimum and maximum
values given by

$$ \tau_{-2}^{\min} \approx (z/2)^3, \qquad \tau_{-2}^{\max} \approx z R^2 / 2^3, \tag{7} $$

with $z = m^{\mathrm{jet}}/p_T$. We show in Fig. 2 the angularity distribution for the leading jet, requiring
that $m^{\mathrm{jet1}} \in (90, 120)$ GeV/$c^2$. The requirement of a relatively narrow $m^{\mathrm{jet1}}$ window allows
us to compare the observed distribution with the shape and kinematic endpoints predicted by
pQCD. The PYTHIA and QCD predictions are in good agreement with the data for Midpoint
and anti-$k_t$ jets, although the small size of the jet sample after applying the mass criterion
limits the statistical precision of the comparison. This further strengthens the interpretation
that these massive jets arise from two-body configurations. The small number of jets below
$\tau_{-2}^{\min}$ arises from resolution effects. The PDF uncertainties on the PYTHIA predictions are
10%, and are shown in the figure. The results for jets with cone sizes of $R = 0.4$ are similar.
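
To give a feeling for the size of the kinematic endpoints, $\tau_{-2}^{\min} \approx (z/2)^3$ and $\tau_{-2}^{\max} \approx zR^2/2^3$ with $z = m^{\mathrm{jet}}/p_T$, the sketch below evaluates them for one illustrative jet. The chosen mass (the middle of the 90-120 GeV/$c^2$ window) and the $p_T$ at the 400 GeV/$c$ threshold are assumptions made for the example, and the function name is hypothetical.

```python
def angularity_endpoints(m_jet, pt, r):
    """Return (tau_min, tau_max) for tau_{-2}, using z = m_jet / pt."""
    z = m_jet / pt
    tau_min = (z / 2.0) ** 3          # symmetric two-body configuration
    tau_max = z * r ** 2 / 2.0 ** 3   # soft emission at the edge of the cone
    return tau_min, tau_max

# Assumed example: m_jet = 105 GeV/c^2, pT = 400 GeV/c, R = 0.7.
tau_min, tau_max = angularity_endpoints(105.0, 400.0, 0.7)
print(f"tau_min = {tau_min:.5f}, tau_max = {tau_max:.5f}")
```

The lower edge lies below the upper edge whenever $z < R$, which holds throughout the regime $m^{\mathrm{jet}} \ll R\,p_T$ considered here.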
Figure 3 shows the planar flow distribution for jets where the jet mass is required to be
in the range 130-210 GeV/$c^2$, relevant for jets arising from top quark decays. Comparisons
with the PYTHIA predictions are also shown for both QCD multi-jet and $t\bar{t}$ production.
Although the data are in good agreement with the predictions from QCD, the comparison
is statistically limited because of the small number of observed jets in this jet mass range.
The PDF uncertainties on the PYTHIA QCD predictions are 10%. The results for jets
reconstructed with the Midpoint and anti-$k_t$ algorithms are in good agreement with each
other and are consistent with the general expectation based on MC calculations [11]. This
study suggests that with higher statistics it will be possible to use the planar flow variable
to discriminate high-$p_T$ QCD and top quark jets independent of jet mass.

FIG. 3. The planar flow distributions for Midpoint jets with $p_T > 400$ GeV/$c$ and $|\eta| \in (0.1, 0.7)$
after applying the top rejection cuts and requiring $m^{\mathrm{jet1}} \in (130, 210)$ GeV/$c^2$. We also show the
PYTHIA QCD (red dashed line) and $t\bar{t}$ (blue dotted line) jets, as well as the results from the two
jet algorithms (inset). All distributions have been separately normalized to unity. We expect only
~10% of the jets to arise from SM $t\bar{t}$ production.

In summary, we have measured for the first time the mass, angularity and planar flow
distributions for jets with $p_T > 400$ GeV/$c$ using the Midpoint and anti-$k_t$ jet algorithms. We
find good agreement between PYTHIA Monte Carlo predictions, the NLO QCD jet function
predictions, and the data for the jet mass distribution above 100 GeV/$c^2$ for Midpoint
and anti-$k_t$ jets. The Midpoint and anti-$k_t$ algorithms have very similar jet substructure
distributions for high mass jets. Our results show that jet mass is an effective
variable for separating jets produced through QCD from those produced through $t\bar{t}$ production, with a
jet mass requirement of greater than 140 GeV/$c^2$ leaving only 1.4 ± 0.3% of the QCD jets.
We have also shown, from a study of the angularity variable, that the high mass jets coming
from light quark and gluon production are consistent with two-body final states, and that it
may be possible to use the planar flow variable to further reject high mass QCD jets. These
results provide the first experimental evidence that validates the MC calculations employing
jet substructure to search for exotic heavy particles.